% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/decal.R
\name{decal}
\alias{decal}
\title{DECAL: Differential Expression analysis of Clonal Alterations Local effects
based on Negative Binomial distribution}
\usage{
decal(
  perturbations,
  count,
  clone,
  theta = NULL,
  theta_sample = 2000,
  min_mu = 0.05,
  min_n = 3,
  min_x = 1,
  gene_col = "gene",
  clone_col = "clone",
  p_method = "BH"
)
}
\arguments{
\item{perturbations}{table of clone and gene perturbation pairs for which to
model the differential expression effect.}

\item{count}{UMI count matrix with cells as columns and genes (or features)
as rows.}

\item{clone}{list of cells per clone.}

\item{theta}{gene (or feature) dispersion parameter.}

\item{theta_sample}{number of genes sampled for the preliminary \code{theta} estimation.}

\item{min_mu}{minimal overall average expression (\code{mu}) required.}

\item{min_n}{minimal number of perturbed cells (\code{n1}) required.}

\item{min_x}{minimal average expression of perturbed (\code{x1}) and non-perturbed
cells (\code{x0}) required.}

\item{gene_col}{gene index column name in \code{perturbations}}

\item{clone_col}{clone index column name in \code{perturbations}}

\item{p_method}{p-value adjustment method for multiple comparisons.
See \code{\link[stats]{p.adjust}}.}
}
\value{
The \code{perturbations} table extended with the following columns:
\itemize{
\item \code{n0} and \code{n1}: number of non-perturbed and perturbed cells
\item \code{x0} and \code{x1}: average count of non-perturbed and perturbed cells
\item \code{mu}: overall average expression
\item \code{theta}: negative binomial dispersion parameter
\item \code{xb}: perturbed cells' estimated average count
\item \code{z}: perturbed cells' standardized z-score effect
\item \code{lfc}: perturbed cells' log2 fold-change effect
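\item \code{pvalue}: p-value of the perturbation effect
\item \code{p_adjusted}: p-value adjusted for multiple comparisons (see \code{p_method})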
}
}
\description{
This function performs differential expression analysis of clonal
alterations for pairs of clonal sub-populations and perturbed genes.
}
\details{
Given a table of clone and gene pairs, a UMI count matrix, and a list of cells
per clone, this function models gene expression (\code{Y}) with a negative
binomial (a.k.a. Gamma-Poisson) distribution for each perturbation pair as
a function of \code{X} (clone indicator variable) offset by the cell total count
(\code{D}) as described by the model:

\deqn{Y \sim NB(xb, \theta)}
\deqn{\log(xb) = \beta_0 + \beta_x \cdot X + \log(D)}
\deqn{\theta \sim \mu}
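
As a worked consequence of the model above: for two cells with the same total
count \code{D}, the ratio of expected counts between a perturbed (\eqn{X = 1})
and a non-perturbed (\eqn{X = 0}) cell is \eqn{\exp(\beta_x)}, so a base-2 log
fold-change corresponds to the expression below. That the reported \code{lfc}
is computed exactly this way is an assumption, not something stated here.

\deqn{\log_2 \frac{xb_{X=1}}{xb_{X=0}} = \frac{\beta_x}{\ln 2}}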

The gene dispersion parameter (\code{theta}) is estimated and regularized in two
steps as developed by Hafemeister & Satija (2019). First, for a subset of
genes, it fits a \emph{Poisson} regression offset by \code{log(D)} and estimates a
crude \code{theta} using a maximum likelihood estimator with the observed counts
and regression results. Next, it regularizes and expands the \code{theta} estimates
with a kernel smoothing function as a function of average count (\code{mu}).
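
The two steps can be sketched on simulated data as follows. This is a minimal
illustration, not the \code{decal} implementation: it assumes
\code{MASS::theta.ml} as the maximum likelihood estimator and
\code{stats::ksmooth} as the kernel smoother, and the simulated inputs, the
bandwidth, and all variable names are hypothetical.

\preformatted{
set.seed(1)
n_cells <- 300
n_genes <- 50
# Hypothetical cell total counts and per-gene relative expression
D <- round(runif(n_cells, 1000, 5000))
rate <- exp(runif(n_genes, log(1e-3), log(1e-2)))
# Simulate NB counts (cells as rows here) with a true dispersion of 5
Y <- sapply(rate, function(r) rnbinom(n_cells, mu = r * D, size = 5))

# Step 1: Poisson regression offset by log(D), then a crude ML theta
crude <- apply(Y, 2, function(y) {
  fit <- glm(y ~ 1 + offset(log(D)), family = poisson)
  as.numeric(MASS::theta.ml(y, fitted(fit)))
})
mu <- colMeans(Y)

# Step 2: smooth log10(theta) as a function of log10(mu)
sm <- ksmooth(log10(mu), log10(crude), kernel = "normal",
              bandwidth = 0.5, x.points = log10(mu))
# ksmooth returns its output sorted by x, so map back to gene order
theta_reg <- 10^sm$y[match(log10(mu), sm$x)]
}

In this sketch \code{theta_reg} holds one smoothed dispersion per simulated
gene; evaluating the smoother at the \code{mu} of genes outside the sampled
subset is how the estimates would be expanded to all genes.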
}
